Description

BACKGROUND OF THE INVENTION 

Field of the Invention

This invention relates to a signal processing apparatus or system that carries out signal processing using a so-called neural network made up of a plurality of units, each performing signal processing corresponding to that of a neuron, and to a learning processing apparatus or system that causes a signal processing section formed by such a neural network to undergo learning processing in accordance with the back-propagation learning rule.

Prior Art 

The back-propagation learning rule, a learning algorithm for neural networks, has been applied on a trial basis to signal processing, including high-speed image processing and pattern recognition, as disclosed in "Parallel Distributed Processing", vol. 1, The MIT Press, 1986, and in "Nikkei Electronics", issue of August 10, 1987, No. 427, pp. 115 to 124. The back-propagation learning rule is also applied, as shown in Fig. 1, to a multilayer neural network having an intermediate layer 2 between an input layer 1 and an output layer 3.

Each unit $u_j$ of the neural network shown in Fig. 1 issues an output value obtained by transforming, with a predetermined function $f_j$ such as a sigmoid function, the total sum $\mathrm{net}_j$ of the output values $O_i$ of the units $u_i$ coupled to the unit $u_j$ by coupling coefficients $W_{ji}$. That is, when the value of a pattern $p$ is supplied as an input value to each unit of the input layer 1, the output value $O_{pj}$ of each unit $u_j$ of the intermediate layer 2 and the output layer 3 is expressed by the following formula (1):

$$O_{pj} = f_j(\mathrm{net}_{pj}) = f_j\Bigl(\sum_i W_{ji}\,O_{pi}\Bigr) \qquad (1)$$

The output value $O_{pj}$ of each unit $u_j$ of the output layer 3 may be obtained by sequentially computing the output values of the units $u_j$, each corresponding to a neuron, from the input layer 1 towards the output layer 3.
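The layer-by-layer computation of formula (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sigmoid`, `layer_output` and `forward`, the choice of the sigmoid as the function f, and the storage of the coupling coefficients as one list of weights per receiving unit are all introduced here and are not taken from the cited references.

```python
import math


def sigmoid(x):
    """Sigmoid output function, used here as the assumed choice of f."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(weights, inputs, f=sigmoid):
    """Formula (1): O_pj = f(sum_i W_ji * O_pi) for every unit j.

    weights[j][i] is the coupling coefficient from unit i to unit j.
    """
    return [f(sum(w * o for w, o in zip(row, inputs))) for row in weights]


def forward(w_hidden, w_output, pattern):
    """Compute outputs from the input layer towards the output layer."""
    hidden = layer_output(w_hidden, pattern)
    output = layer_output(w_output, hidden)
    return hidden, output
```

With all coupling coefficients set to zero, every unit sees a total sum of zero and therefore outputs 0.5, which makes a convenient sanity check.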

In accordance with the back-propagation learning algorithm, learning processing, which consists in modifying the coupling coefficient $W_{ji}$ so as to minimize the total sum $E_p$ of the square errors between the actual output value $O_{pj}$ of each unit $u_j$ of the output layer 3 on application of the pattern $p$ and the desired output value $t_{pj}$, that is, the teacher signal,

$$E_p = \frac{1}{2}\sum_j (t_{pj} - O_{pj})^2 \qquad (2)$$

is performed sequentially from the output layer 3 towards the input layer 1. By such learning processing, the output value $O_{pj}$ closest to the value $t_{pj}$ of the teacher signal is output from the unit $u_j$ of the output layer 3.
If the variant $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ which minimizes the total sum $E_p$ of the square errors is set so that

$$\Delta W_{ji} \propto -\frac{\partial E_p}{\partial W_{ji}} \qquad (3)$$

the formula (3) may be rewritten as

$$\Delta W_{ji} = \eta\,\delta_{pj}\,O_{pi} \qquad (4)$$

as explained in detail in the above reference materials.

In the above formula (4), $\eta$ stands for the learning rate, which is a constant that may be determined empirically from the number of units or layers or from the input or output values, and $\delta_{pj}$ stands for the error proper to the unit $u_j$.

Therefore, to determine the above variant $\Delta W_{ji}$, it suffices to compute the error $\delta_{pj}$ in the reverse direction, that is, from the output layer towards the input layer of the network.

The error $\delta_{pj}$ of the unit $u_j$ of the output layer 3 is given by the formula (5)

$$\delta_{pj} = (t_{pj} - O_{pj})\,f'_j(\mathrm{net}_j) \qquad (5)$$

On the other hand, the error $\delta_{pj}$ of the unit $u_j$ of the intermediate layer 2 may be computed by the recurrence formula (6)

$$\delta_{pj} = f'_j(\mathrm{net}_j)\sum_k \delta_{pk}\,W_{kj} \qquad (6)$$

using the error $\delta_{pk}$ and the coupling coefficient $W_{kj}$ of each unit $u_k$ coupled to the unit $u_j$, herein each unit of the output layer 3. The derivation of the above formulas (5) and (6) is explained in detail in the above reference materials.

In the above formulas, $f'_j(\mathrm{net}_j)$ stands for the derivative of the output function $f_j(\mathrm{net}_j)$.

Although the variant $\Delta W_{ji}$ may be found from the above formula (4), using the results of the formulas (5) and (6), more stable results may be obtained by finding it from the following formula (7)

$$\Delta W_{ji}(n+1) = \eta\,\delta_{pj}\,O_{pi} + \alpha\,\Delta W_{ji}(n) \qquad (7)$$

with the use of the results of the preceding learning. In the above formula, $\alpha$ stands for a stabilization factor for reducing the error oscillations and accelerating convergence.
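The error computation of formulas (5) and (6) and the stabilized update of formula (7) can be illustrated as follows. The sketch assumes a sigmoid output function, for which the derivative f'(net) equals O(1 - O); the function names and the list-of-rows weight layout are assumptions made for this example only.

```python
def output_deltas(targets, outputs):
    """Formula (5) with the assumed sigmoid: delta_j = (t_j - O_j) * O_j * (1 - O_j)."""
    return [(t - o) * o * (1.0 - o) for t, o in zip(targets, outputs)]


def hidden_deltas(hidden, w_output, deltas_out):
    """Formula (6): delta_j = f'(net_j) * sum_k delta_k * W_kj.

    w_output[k][j] couples intermediate unit j to output unit k.
    """
    return [
        h * (1.0 - h) * sum(d * w_output[k][j] for k, d in enumerate(deltas_out))
        for j, h in enumerate(hidden)
    ]


def momentum_update(deltas, inputs, prev_dw, eta, alpha):
    """Formula (7): dW_ji(n+1) = eta * delta_j * O_i + alpha * dW_ji(n)."""
    return [
        [eta * d * o + alpha * prev_dw[j][i] for i, o in enumerate(inputs)]
        for j, d in enumerate(deltas)
    ]
```

The errors are computed for the output layer first and then for the intermediate layer, which mirrors the reverse direction of computation described in the text.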

The above described learning is repeated and is terminated when the total sum $E_p$ of the square errors between the output value $O_{pj}$ and the teacher signal $t_{pj}$ becomes sufficiently small.

It is noted that, in the conventional signal processing system in which the aforementioned back-propagation learning rule is applied to the neural network, the learning constant is determined empirically from the numbers of layers and of units corresponding to neurons or from the input and output values, and the learning is carried out at a constant learning rate using the above formula (7). Thus the number of repetitions n of the learning required until the total sum $E_p$ of the square errors between the output value $O_{pj}$ and the teacher signal $t_{pj}$ becomes small enough to terminate the learning may be so enormous as to make efficient learning infeasible.

Also, the above described signal processing system is constructed as a network consisting only of feed-forward couplings between the units corresponding to neurons. Hence, when the features of the input signal pattern are to be extracted by learning the coupling state of the network from the input signals and the teacher signal, it is difficult to extract the sequential time-series or chronological pattern of audio signals fluctuating on the time axis.

In addition, while the learning processing of the above described multilayer neural network in accordance with the back-propagation learning rule has a promisingly high functional ability, it frequently occurs that only a local minimum, and not the optimum global minimum, is reached in the course of the learning process, such that the total sum $E_p$ of the square errors cannot be reduced sufficiently.

Conventionally, when such a local minimum is reached, the initial value of the learning rate $\eta$ is changed and the learning processing is repeated until the optimum global minimum is found. This results in considerable fluctuation and protraction of the learning processing time.

Objects of the Invention 

It is a primary object of the present invention to provide a signal processing system in which the number of repetitions of learning until termination of learning may be reduced to realize more efficient learning.

It is a second object of the present invention to provide a signal processing system whereby the features of sequential time-series patterns fluctuating on the time axis, for example those of audio signals, may be extracted by learning of the coupling states in a network constituted by plural units corresponding to neurons.

Summary of the Invention 

To accomplish the primary object, the present invention provides a signal processing system according to claim 1.

To accomplish the second object, the present invention provides a signal processing system according to claim 2.

The above and other objects and novel features of the present invention will become apparent from the following detailed description of the invention, which is made in conjunction with the accompanying drawings and the new matter pointed out in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS



Fig. 1 is a diagrammatic view showing the general construction of a neural network to which the back-propagation learning rule is applied.

Fig. 2 is a block diagram schematically showing the construction of a signal processing system according to a first embodiment of the present invention.

Fig. 3 is a flow chart showing the learning processing performed in the learning processing section of the signal processing system according to the embodiment shown in Fig. 2.

Fig. 4 is a block diagram schematically showing the construction of a signal processing system according to a second embodiment of the present invention.

Fig. 5 is a diagrammatic view of a neural network showing the construction of the signal processing section of the signal processing system according to the embodiment shown in Fig. 4.

Fig. 6 is a flow chart showing the learning processing performed in the learning processing section of the signal processing system of the embodiment shown in Fig. 4.

Fig. 7 is a block diagram schematically showing the construction of a learning processing system in which the present invention may be incorporated.

Figs. 8A and 8B are diagrammatic views showing the state of the signal processing section at the start and in the course of learning processing in the learning processing system shown in Fig. 7.

Fig. 9 is a flow chart showing a typical process of learning processing in the learning processing section of the learning processing system shown in Fig. 7.

Fig. 10 is a chart showing typical results of tests of learning processing on the signal processing section of the neural network shown in Fig. 5 by the learning processing section of the learning processing system of Fig. 7.

Fig. 11 is a chart showing the results of tests of learning on the signal processing section of the neural network shown in Fig. 5, with the number of units of the intermediate layer fixed at six.

Fig. 12 is a chart showing the results of tests of learning on the signal processing section of the neural network shown in Fig. 5, with the number of units of the intermediate layer fixed at three.



DETAILED DESCRIPTION OF THE EMBODIMENTS 

Certain preferred embodiments of the present invention will be explained in more detail by referring to the drawings.

The signal processing system of the present invention includes, as shown schematically in Fig. 2, a signal processing section 10 for producing an output value $O_{pj}$ from input signal patterns $p$ and a learning processing section 20 for executing learning so that the signal processing section 10 produces an output value $O_{pj}$ closest to the desired output value $t_{pj}$ from the input signal patterns $p$.

The signal processing section 10 is formed by a neural network including at least an input layer $L_I$, an intermediate layer $L_H$ and an output layer $L_O$. These layers $L_I$, $L_H$ and $L_O$ are made up of units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$, respectively, each corresponding to a neuron, wherein x, y and z each represent an arbitrary number.

Each of the units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is designed to issue an output $O_{pj}$ represented by a sigmoid function according to the formula (8)

$$O_{pj} = \frac{1}{1 + e^{-\mathrm{net}_j}} \qquad (8)$$
for the total sum $\mathrm{net}_j$ of inputs represented by the formula (9)

$$\mathrm{net}_j = \sum_i W_{ji}\,O_{pi} + \theta_j \qquad (9)$$

where $\theta_j$ stands for a threshold value.

The learning processing section 20 is fed with a desired output value $t_{pj}$ as a teacher signal for the output value $O_{oj}$ of the output layer $L_O$ for the input signal patterns $p$ entered into the signal processing section 10. This learning processing section 20 causes the signal processing section 10 to undergo learning processing of the coupling coefficient $W_{ji}$ according to the sequence of steps shown in the flow chart of Fig. 3. The coefficient $W_{ji}$ of the coupling strength between the units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is computed sequentially and repeatedly from the output layer $L_O$ towards the input layer $L_I$ until the sum of the square errors between the desired output value $t_{pj}$ and the actual output value $O_{oj}$ becomes sufficiently small, so that the output value $O_{oj}$ of the output layer $L_O$ will be closest to the desired output value $t_{pj}$ supplied as the teacher signal.

Thus, in step 1, the learning processing section 20 affords the coupling coefficient $W_{ji}$ to each of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ to compute the output value $O_{oj}$ of the output layer $L_O$ for the input signal patterns $p$ in the signal processing section 10. In step 2, the section 20 decides the converging condition for the actual output value $O_{oj}$, on the basis of the total sum $E_p$ of the square errors between the actual output value $O_{oj}$ and the desired output value $t_{pj}$ supplied as the teacher signal.

In the decision step 2, it is decided whether the output value $O_{oj}$ obtained at the output layer $L_O$ of the signal processing section 10 is closest to the desired output value $t_{pj}$. If the result of decision at step 2 is YES, that is, when the total sum $E_p$ of the square errors has become sufficiently small and the output value $O_{oj}$ is closest to the desired output value $t_{pj}$, the learning processing is terminated. If the result of decision is NO, the computing operations of steps 3 through 6 are executed sequentially.

In the next computing step 3, the error $\delta_{pj}$ at each of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ of the signal processing section 10 is computed. In the computing operation of step 3, the error $\delta_{oj}$ of each of the units $u_{O1}$ to $u_{Oz}$ of the output layer $L_O$ is given by the following formula (10):

$$\delta_{oj} = (t_{pj} - O_{oj})\,O_{oj}\,(1 - O_{oj}) \qquad (10)$$

On the other hand, the error $\delta_{Hj}$ of each of the units $u_{H1}$ to $u_{Hy}$ of the intermediate layer $L_H$ is given by the following formula (11):

$$\delta_{Hj} = O_{Hj}\,(1 - O_{Hj})\sum_k \delta_{ok}\,W_{kj} \qquad (11)$$

In the next computing step 4, the learning variable $\beta_j$ of the coefficient $W_{ji}$ of the coupling strength from the i'th one to the j'th one of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is computed as the reciprocal of the sum of the squares of all the inputs, to which 1 is added for the threshold value, that is, in accordance with the following formula (12):

$$\beta_j = \frac{1}{\sum_i O_{pi}^2 + 1} \qquad (12)$$
Then, in the computing step 5, the variant $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ from the i'th one to the j'th one of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is computed, using the above learning variable $\beta_j$, in accordance with the following formula (13)

$$\Delta W_{ji}(n+1) = \eta\,\beta_j\,(\delta_{pj}\,O_{pi}) + \alpha\,\Delta W_{ji}(n) \qquad (13)$$

where $\eta$ stands for the learning constant and $\alpha$ for the stabilization constant for reducing the error oscillations and accelerating convergence.

Then, in the computing step 6, the coupling coefficient $W_{ji}$ of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is modified, on the basis of the variant $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ computed at step 5, in accordance with the following formula (14):

$$W_{ji} = W_{ji} + \Delta W_{ji} \qquad (14)$$

The processing then reverts to step 1, in which the output value $O_{oj}$ of the output layer $L_O$ for the input patterns $p$ at the signal processing section 10 is computed.
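Steps 4 through 6 can be illustrated with the following Python sketch. It assumes that the learning variable is the reciprocal of the sum of the squared inputs to a unit plus 1, as described for formula (12), and that the errors of step 3 are already available; the function names and the list-of-rows weight layout are assumptions of this example.

```python
def learning_variable(inputs):
    """Formula (12): reciprocal of the squared-input sum plus 1 for the threshold."""
    return 1.0 / (sum(o * o for o in inputs) + 1.0)


def normalized_update(weights, prev_dw, deltas, inputs, eta, alpha):
    """Formulas (13) and (14) for one layer.

    weights[j][i] and prev_dw[j][i] refer to the coupling from unit i to unit j.
    Returns the modified weights and the variants to reuse at the next step.
    """
    beta = learning_variable(inputs)  # same inputs feed every unit of the layer
    new_dw = [
        [eta * beta * d * o + alpha * prev_dw[j][i] for i, o in enumerate(inputs)]
        for j, d in enumerate(deltas)
    ]
    new_w = [
        [w + dw for w, dw in zip(w_row, dw_row)]
        for w_row, dw_row in zip(weights, new_dw)
    ]
    return new_w, new_dw
```

Because the learning variable shrinks as the inputs grow, the effective step size is large for weak input patterns and small for strong ones, which is the normalization effect the embodiment relies on.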
The learning processing section 20 executes the above steps 1 through 6 repeatedly, until the learning processing is terminated by the decision at step 2, that is, when the total sum $E_p$ of the square errors between the desired output $t_{pj}$ afforded as the teacher signal and the output value $O_{oj}$ becomes sufficiently small and the output value $O_{oj}$ obtained at the output layer $L_O$ of the signal processing section 10 is closest to the desired output value $t_{pj}$.

In this manner, in the signal processing system of the present first embodiment, the learning constant $\eta$ is normalized by the above learning variable $\beta$, represented by the reciprocal of the sum of the squares of the input values $O_{pi}$ at each of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$, to which 1 is added as the threshold value. This causes the learning rate to be changed dynamically as a function of the input value $O_{pi}$. By performing the learning processing of the coupling coefficient $W_{ji}$ with the learning rate changed dynamically in this manner as a function of the input value $O_{pi}$, it becomes possible to reduce the number n of times of learning significantly, to one fourth to one tenth of that required by the conventional learning processing.

It is noted that the learning constant $\eta$ and the stabilizing constant $\alpha$ in the formula (13) may be represented as functions of the maximum error $E_{max}$ for the input patterns as a whole, as shown by the formulas (15) and (16):

$$\eta = a\,E_{max} \qquad (15)$$

$$\alpha = -b\,E_{max} + c \qquad (16)$$

where a, b and c are constants. By changing $\eta$ and $\alpha$ dynamically in this manner, it becomes possible to perform faster learning processing.
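As a small illustration of formulas (15) and (16), the two constants can be recomputed from the current maximum error before each update. The function name `schedule` is an assumption of this sketch, and the text gives no numerical values for a, b and c, so any values passed in are hypothetical.

```python
def schedule(e_max, a, b, c):
    """Formulas (15) and (16): derive eta and alpha from the maximum error E_max.

    a, b and c are constants chosen by the designer; no values are given
    in the text, so callers must supply their own.
    """
    eta = a * e_max
    alpha = -b * e_max + c
    return eta, alpha
```

With positive a and b, a large maximum error gives a large learning constant and a small stabilizing constant, and the balance shifts towards the stabilizing term as the error falls.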

According to the above described first embodiment of the signal processing system, the learning constant $\eta$ is normalized by the learning variable $\beta$, represented by the reciprocal of the sum of the squares of the actual inputs $O_{pi}$ to each unit, to which 1 is added as a threshold value. The learning rate is thereby changed dynamically in accordance with the input value $O_{pi}$ during the learning processing of the coupling coefficient $W_{ji}$, so that it becomes possible to perform stable and fast learning.

A second illustrative embodiment of the signal processing system according to the present invention will be hereinafter explained.

As shown schematically in Fig. 4, the signal processing system of the present illustrative embodiment includes a signal processing section 30 for obtaining the output value $O_{pj}$ from the input signal patterns $p$ and a learning processing section 40 for causing the signal processing section 30 to undergo learning to obtain the output value $O_{pj}$ closest to the desired output value $t_{pj}$ from the input signal patterns $p$.

The signal processing section 30 is formed, as shown in Fig. 5, by a neural network of a three-layer structure including at least an input layer $L_I$, an intermediate layer $L_H$ and an output layer $L_O$. These layers $L_I$, $L_H$ and $L_O$ are constituted by units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$, respectively, each corresponding to a neuron, where x, y and z stand for arbitrary numbers. Each of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ of the intermediate layer $L_H$ and the output layer $L_O$ is provided with delay means and forms a recurrent network including a loop LP having its output $O_{j(t)}$ as its own input by way of the delay means and a feedback FB having its output $O_{j(t)}$ as an input to another unit.

In the signal processing section 30, with the input signal patterns $p$ entered into each of the units $u_{I1}$ to $u_{Ix}$ of the input layer $L_I$, the total sum $\mathrm{net}_j$ of the inputs to the units $u_{H1}$ to $u_{Hy}$ of the intermediate layer $L_H$ is given by the following formula (17):

$$\mathrm{net}_j = \sum_{i=1}^{x}\sum_{k=0}^{NI} W_{j,\,x \cdot k+i}\,O_{Ii}(t-k) + \sum_{i=1}^{y}\sum_{k=1}^{NH} W_{j,\,y \cdot k+i}\,O_{Hi}(t-k) + \sum_{i=1}^{z}\sum_{k=1}^{NO} W_{j,\,z \cdot k+i}\,O_{oi}(t-k) + \theta_j \qquad (17)$$

Each of the units $u_{H1}$ to $u_{Hy}$ of the intermediate layer issues, for the total sum $\mathrm{net}_j$ of the input signals, an output value $O_{Hj(t)}$ represented by the sigmoid function of the following formula (18):

$$O_{Hj(t)} = \frac{1}{1 + e^{-\mathrm{net}_j}} \qquad (18)$$

The total sum $\mathrm{net}_j$ of the inputs to the units $u_{O1}$ to $u_{Oz}$ of the output layer $L_O$ is given by the following formula (19):

$$\mathrm{net}_j = \sum_{i=1}^{y}\sum_{k=0}^{NH} W_{j,\,y \cdot k+i}\,O_{Hi}(t-k) + \sum_{i=1}^{z}\sum_{k=1}^{NO} W_{j,\,z \cdot k+i}\,O_{oi}(t-k) + \theta_j \qquad (19)$$

while each of the units $u_{O1}$ to $u_{Oz}$ of the output layer $L_O$ issues, for the total sum $\mathrm{net}_j$ of the inputs, an output value $O_{oj(t)}$ represented by the following formula (20):

$$O_{oj(t)} = \frac{1}{1 + e^{-\mathrm{net}_j}} \qquad (20)$$

where $\theta_j$ stands for a threshold value and NI, NH and NO stand for the numbers of the delay means provided in the layers $L_I$, $L_H$ and $L_O$, respectively.
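The role of the delay means can be illustrated by computing the total input of one intermediate unit from the present and delayed outputs of each layer. The sketch below is an assumption-laden illustration rather than the circuit of Fig. 5: it stores the weights per delay step instead of using a single flat index, it assumes that input-layer terms run over delays 0 to NI while intermediate-layer and output-layer terms run over delays 1 to NH and 1 to NO, and all names are hypothetical.

```python
def shift_history(history, newest):
    """Push the newest outputs to delay 0 and drop the oldest entry."""
    return [newest] + history[:-1]


def net_input(w_in, w_hid, w_out, in_hist, hid_hist, out_hist, theta):
    """Total input of one unit of a recurrent network with delay lines.

    Each *_hist[k] holds a layer's outputs at time t - k.
    w_in[k][i] weights input unit i at delay k (k = 0 .. NI);
    w_hid[k-1][i] and w_out[k-1][i] weight delayed outputs (k = 1 .. NH or NO),
    which gives the loop and feedback couplings of the recurrent network.
    """
    total = theta
    for k, w_k in enumerate(w_in):
        total += sum(w * o for w, o in zip(w_k, in_hist[k]))
    for k, w_k in enumerate(w_hid, start=1):
        total += sum(w * o for w, o in zip(w_k, hid_hist[k]))
    for k, w_k in enumerate(w_out, start=1):
        total += sum(w * o for w, o in zip(w_k, out_hist[k]))
    return total
```

Passing the result through the sigmoid and then calling `shift_history` for each layer advances the network by one time step, so that earlier samples of a time-series pattern keep influencing later outputs.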

The learning processing section 40 computes the coefficient $W_{ji}$ of coupling strength between the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$, from the output layer $L_O$ towards the input layer $L_I$, sequentially and repeatedly, according to the sequence shown in the flow chart of Fig. 6, while executing the learning processing of the coupling coefficient $W_{ji}$ so that the total sum LMS of the square errors between the desired output value $t_{pj}$ afforded as the teacher signal and the output value $O_{oj}$ of the output layer $L_O$ will be sufficiently small. By such learning processing, the learning processing section 40 causes the output value $O_{oj}$ of the output layer $L_O$ to be closest to the desired output value $t_{zr}$, afforded as the teacher signal patterns, for an input signal pattern $p_{(xr)}$ supplied to the signal processing section 30. This pattern $p_{(xr)}$ represents an information unit as a whole which fluctuates along the time axis and is represented by xr data, where r stands for the number of times of sampling of the information unit and x for the number of data in each sample.

That is, the section 40 affords at step 1 the input signal patterns $p_{(xr)}$ to each of the units $u_{I1}$ to $u_{Ix}$ of the input layer $L_I$, and proceeds to compute at step 2 the output value $O_{pj(t)}$ of each of the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ of the intermediate layer $L_H$ and the output layer $L_O$.

The section 40 then proceeds to compute at step 3 the error $\delta_{pj}$ of each of the units $u_{O1}$ to $u_{Oz}$ and $u_{H1}$ to $u_{Hy}$, from the output layer $L_O$ towards the input layer $L_I$, on the basis of the output values $O_{pj(t)}$ and the desired output value $t_{zr}$ afforded as the teacher signal.

In the computing step 3, the error $\delta_{oj}$ of each of the units $u_{O1}$ to $u_{Oz}$ of the output layer $L_O$ is given by the following formula (21):

$$\delta_{oj} = (t_{pj} - O_{oj})\,O_{oj}\,(1 - O_{oj}) \qquad (21)$$

while the error $\delta_{Hj}$ of each of the units $u_{H1}$ to $u_{Hy}$ of the intermediate layer $L_H$ is given by the following formula (22):

$$\delta_{Hj} = O_{Hj}\,(1 - O_{Hj})\sum_k \delta_{ok}\,W_{kj} \qquad (22)$$

Then, in step 4, the learning variable $\beta_j$ of the coefficient $W_{ji}$ of coupling strength from the i'th one to the j'th one of the units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ is computed by the following formula (23)

$$\beta_j = \frac{1}{\sum_i O_{pi}^2 + 1} \qquad (23)$$

in which the learning variable $\beta_j$ is represented by the reciprocal of the sum of the squares of the input values, to which 1 is added as a threshold value.

Then, in step 5, using the learning variable $\beta_j$ computed in step 4, the variant $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ from the i'th one to the j'th one of the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$ is computed in accordance with the following formula (24):

$$\Delta W_{ji}(n) = \eta\,\beta_j\,(\delta_{pj}\,O_{pi}) \qquad (24)$$

In the formula, $\eta$ stands for a learning constant.

Then, in step 5, the total sum LMS of the square errors of the units with respect to the teacher signal is computed in accordance with the formula (25)

$$LMS = \sum_{p=1}\sum_{i=1} (t_{pi} - O_{pi})^2 \qquad (25)$$

Then, in step 6, it is decided whether the processing of the steps 1 through 5 has been performed on the R-number of input signal patterns $p_{(xr)}$. If the result of decision at step 6 is NO, the section 40 reverts to step 1. When the result of decision at step 6 is YES, that is, when all of the variants $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ between the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$ have been computed for the input signal patterns $p_{(xr)}$, the section 40 proceeds to step 7 to decide the converging condition for the output value $O_{oj}$ obtained at the output layer $L_O$ on the basis of the total sum LMS of the square errors between the output value $O_{oj}$ and the desired output value $t_{pj}$ afforded as the teacher signal.

In the decision step 7, it is decided whether the output value $O_{oj}$ obtained at the output layer $L_O$ of the signal processing section 30 is closest to the desired output value $t_{zr}$ afforded as the teacher signal. When the result of decision at step 7 is YES, that is, when the total sum LMS of the square errors is sufficiently small and the output value $O_{oj}$ is closest to the desired output value $t_{zr}$, the learning processing is terminated. If the result of decision at step 7 is NO, the section 40 proceeds to the computing step 8.

In this computing step 8, the coupling coefficient $W_{ji}$ between the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$ is modified, on the basis of the variant $\Delta W_{ji}$ of the coupling coefficient $W_{ji}$ computed at step 5, in accordance with the following formula (26)

$$\Delta W_{ji}(n) = \Delta W_{ji}(n) + \alpha\,\Delta W_{ji}(n-1) \qquad (26)$$

and the following formula (27)

$$W_{ji}(n+1) = W_{ji}(n) + \Delta W_{ji}(n) \qquad (27)$$

After the computing step 8, the section 40 reverts to step 1 to execute the operation of steps 1 to 6.
Thus the section 40 executes the operations of the steps 1 to 8 repeatedly and, when the total sum LMS of the square errors between the desired output value $t_{pj}$ and the actual output value $O_{oj}$ becomes sufficiently small and the output value $O_{oj}$ obtained at the output layer $L_O$ of the signal processing section 30 is closest to the desired output value $t_{pj}$ afforded as the teacher signal, terminates the learning processing by the decision at step 7.
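The bookkeeping of steps 5 through 8 can be sketched as follows. The text does not state how the variants computed for the individual patterns are combined before step 8; this example assumes they are summed, and the function names are hypothetical.

```python
def lms(targets, outputs):
    """Total sum of square errors over all patterns and output units."""
    return sum(
        (t - o) ** 2
        for t_p, o_p in zip(targets, outputs)
        for t, o in zip(t_p, o_p)
    )


def accumulate(dw_sum, dw_pattern):
    """Add the variants of one pattern to the running total (assumed summation)."""
    return [[s + d for s, d in zip(s_row, d_row)]
            for s_row, d_row in zip(dw_sum, dw_pattern)]


def apply_update(weights, dw_sum, prev_dw, alpha):
    """Formulas (26) and (27): add the stabilizing term, then modify the weights."""
    dw = [[d + alpha * p for d, p in zip(d_row, p_row)]
          for d_row, p_row in zip(dw_sum, prev_dw)]
    new_w = [[w + d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(weights, dw)]
    return new_w, dw
```

A training loop would call `accumulate` once per pattern, test `lms` against a tolerance as in step 7, and call `apply_update` only when the tolerance is not yet met.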

In this manner, in the present second embodiment of the signal processing system, the learning of the coupling coefficient $W_{ji}$ between the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$ of the signal processing section 30, which constitutes the recurrent network inclusive of the above mentioned loop LP and feedback FB, is executed by the learning processing section 40 on the basis of the desired output value $t_{pj}$ afforded as the teacher signal. Hence, the features of a sequential time-base input signal pattern $p_{(xr)}$ fluctuating along the time axis, such as an audio signal, may also be extracted reliably by the learning processing of the learning processing section 40. Thus, by setting the coupling state between the units $u_{O1}$ to $u_{Oz}$, $u_{H1}$ to $u_{Hy}$ and $u_{I1}$ to $u_{Ix}$ of the signal processing section 30 with the coupling coefficient $W_{ji}$ obtained as the result of learning by the learning processing section 40, the time-series input signal pattern $p_{(xr)}$ can be subjected to the desired signal processing by the signal processing section 30.

Moreover, in the second illustrative embodiment of the present invention, similarly to the previously described first embodiment, the learning constant $\eta$ is normalized by the learning variable $\beta$, given as the reciprocal of the sum of the squares of the input values at the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$, and the learning processing of the coupling coefficient $W_{ji}$ is performed at a learning rate that changes dynamically as a function of the input value $O_{pi}$, so that learning can be performed stably and expeditiously with a small number of times of learning.

In this manner, in the present second embodiment of the signal processing system, signal processing of input signals is performed at the signal processing section 30, in which the recurrent network inclusive of the loop LP and the feedback FB is constituted by the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ of the intermediate layer $L_H$ and the output layer $L_O$, each provided with delay means. In the learning processing section 40, the learning of the coupling state of the recurrent network formed by the units $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$ constituting the signal processing section 30 is executed on the basis of the teacher signal. Thus the features of sequential time-base patterns fluctuating along the time axis, such as audio signals, can be extracted by the above mentioned learning processing section, so that the signal processing section performs the desired signal processing.

A preferred illustrative embodiment of a learning processing system in which the present invention can be incorporated will be hereinafter explained.

The basic construction of this learning processing system is shown in Fig. 7. As shown therein, the system includes a signal processing section 50 constituted by a neural network of a three-layered structure including at least an input layer $L_I$, an intermediate layer $L_H$ and an output layer $L_O$, each made up of plural units performing signal processing corresponding to that of a neuron, and a learning processing section 60 which subjects the signal processing section 50 to learning processing. This learning processing consists in sequentially and repeatedly computing the coefficient $W_{ji}$ of coupling strength between the above units from the output layer $L_O$ towards the input layer $L_I$ on the basis of the error data $\delta_{pj}$ between the output value $O_{pj}$ of the output layer $L_O$ and the desired output value $t_{pj}$ afforded as the teacher signal, for the input signal patterns $p$ entered into the input layer $L_I$ of the signal processing section 50, and thereby learning the coupling coefficient $W_{ji}$ in accordance with the back-propagation learning rule.

The learning processing section 60 executes the learning processing of the coupling coefficient $W_{ji}$ while causing the number of the units of the intermediate layer $L_H$ of the signal processing section 50 to be increased; that is, the section 60 has the control function of increasing the number of units of the intermediate layer $L_H$ in the course of learning processing of the coupling coefficient $W_{ji}$. The learning processing section 60 subjects the signal processing section 50, whose input layer $L_I$, intermediate layer $L_H$ and output layer $L_O$ are made up of arbitrary numbers x, y and z of units $u_{I1}$ to $u_{Ix}$, $u_{H1}$ to $u_{Hy}$ and $u_{O1}$ to $u_{Oz}$, respectively, each corresponding to a neuron, as shown in Fig. 8A, to learning processing of the coupling coefficient $W_{ji}$, while the section 60 causes the number of the units of the intermediate layer $L_H$ to be increased sequentially from y to (y+m), as shown in Fig. 8B.

It is noted that the control operation of increasing the number of the units of the intermediate layer $L_H$ may be performed periodically in the course of learning processing of the coupling coefficient $W_{ji}$, or each time the occurrence of the above mentioned local minimum state is sensed.

The above mentioned learning processing section 60, having the control function of increasing the number of the units of the intermediate layer $L_H$ in the course of learning processing of the coupling coefficient $W_{ji}$, subjects the signal processing section 50, formed by a neural network of a three-layer structure including the input layer $L_I$, the intermediate layer $L_H$ and the output layer $L_O$, to the learning processing of the coupling coefficient $W_{ji}$ while causing the number of units of the intermediate layer $L_H$ to be increased. Thus, even on occurrence of the local minimum state in the course of learning of the coupling coefficient $W_{ji}$, the number of the units of the intermediate layer $L_H$ of the section 50 can be increased to exit from such local minimum state and effect rapid and reliable convergence into the optimum global minimum state.
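Adding a unit to the intermediate layer amounts to enlarging the stored coupling coefficients while keeping the values already learned. The sketch below shows one way this could be done for a plain feed-forward layout; the initialization of the new coefficients with small random values, the `scale` parameter and the function name are assumptions, since the text does not say how the new couplings are initialized, and the delayed couplings of the recurrent network are left out for brevity.

```python
import random


def add_hidden_unit(w_hidden, w_output, rng, scale=0.1):
    """Grow the intermediate layer from y to y + 1 units.

    w_hidden[j][i]: coupling from input unit i to intermediate unit j.
    w_output[k][j]: coupling from intermediate unit j to output unit k.
    Existing coefficients are kept; new ones are small random values (assumed).
    """
    n_inputs = len(w_hidden[0])
    new_row = [rng.uniform(-scale, scale) for _ in range(n_inputs)]
    grown_hidden = [row[:] for row in w_hidden] + [new_row]
    grown_output = [row + [rng.uniform(-scale, scale)] for row in w_output]
    return grown_hidden, grown_output
```

Calling the function m times grows the layer from y to (y+m) units, and learning then simply continues with the enlarged weight lists.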

Tests were conducted repeatedly, in each of which the learning processing section 60, having the control function of increasing the number of units of the intermediate layer in the course of learning of the coupling coefficient $W_{ji}$, caused the signal processing section 50, constituting the recurrent network including the feedback FB and the loop LP as in the second embodiment of the signal processing system, to undergo learning of the coefficient $W_{ji}$. The number of units of the input layer $L_I$ was 8 (x=8), that of the output layer $L_O$ was 3 (z=3) and the number of delay means of each layer was 2. The input signal patterns $p$ used during the learning operation were 21 time-space patterns of l = 8×7, and the processing algorithm shown in the flow chart of Fig. 9 was used, with the learning being started with three units in the intermediate layer $L_H$ (y=3) and with the number of units of the intermediate layer $L_H$ being increased during the learning process. By increasing the number of units of the intermediate layer $L_H$ three to five times, test results were obtained in which convergence to the optimum global minimum state was realized without going into the local minimum state.

Fig. 10 shows, as an example of the above tests, the test results in which learning processing converging
into the optimum minimum state could be achieved by adding the units of the intermediate layer L_H at the timing
shown by the arrow mark in the figure and by increasing the number of units of the intermediate layer L_H from
three to six. The ordinate in Fig. 10 stands for the total sum LMS of the quadratic errors and the abscissa stands
for the number of times of the learning processing operations.

The processing algorithm shown in the flow chart of Fig. 9 is explained. 

In this processing algorithm, in step 1, the variable K indicating the number of times of the processing for
detecting the local minimum state is initialized to "0", while the first variable Lms for deciding the converging
condition of the learning processing is initialized to 10000000000.

Then, in step 2, the variable n indicating the number of times of learning of the overall learning pattern,
that is, the l-number of the input signal patterns p, is initialized. The program then proceeds to step 3 to execute
the learning processing of the l-number of the input signal patterns p.

Then, in step 4, a decision is made on the variable n indicating the number of times of learning. Unless n=3,
the program proceeds to step 5 to add one to n (n → n+1), and then reverts to step 3 to repeat the learning
processing. When n=3, the program proceeds to step 6.

In step 6, after the value of the first variable Lms is maintained as the value of the second variable Lms(-1)
for deciding the converging condition of the learning processing, the total sum of the square errors between
the output signal and the teacher signal in each unit is computed in accordance with the formula (28), this value
being then used as the new value for the first variable Lms, such that

Lms = \sum_{p} \sum_{i} (t_{pi} - O_{pi})^2 ..... (28)

where the sums run over the input signal patterns p and the units i.
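
As a minimal sketch of the quantity computed in step 6, the total sum of the square errors can be accumulated over all patterns and all output units as follows. The function name `total_square_error` and the nested-list layout (one row per pattern, one entry per unit) are assumptions made for this example only.

```python
def total_square_error(teacher, outputs):
    """Sum of (t_pi - O_pi)^2 over all patterns p and units i."""
    return sum(
        (t - o) ** 2
        for t_row, o_row in zip(teacher, outputs)
        for t, o in zip(t_row, o_row)
    )

# Two patterns, two output units each (illustrative values only).
teacher = [[1.0, 0.0], [0.0, 1.0]]
outputs = [[0.5, 0.5], [0.0, 1.0]]
lms = total_square_error(teacher, outputs)
```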

Then, in step 7, the first variable Lms for deciding the converging condition of the learning processing is
compared with the second variable Lms(-1). If the value of the first variable Lms is smaller than that of the second
variable Lms(-1), the program proceeds to step 8 to decide whether or not the variable K indicating the number
of times of the processing operations for detecting the local minimum state is equal to 0.

If, in step 8, the variable K is 0, the program reverts directly to step 2. If the variable K is not 0, setting of
K → K+1 is made in step 9. The program then reverts to step 2 to initialize n to 0 (n=0) and to execute the
learning processing of the l-number of the input signal patterns p in step 3.

If, in step 7, the value of the first variable Lms is larger than that of the second variable Lms(-1), the program
proceeds to step 10 to increment the value of K indicating the number of times of the processing operations for
detecting the local minimum state (K → K+1). Then, in step 11, it is decided whether or not the value of K is 2.

If, in step 11, the value of the variable K is not 2, the program reverts directly to step 2. If the variable K
is 2, it is decided that the local minimum state is prevailing. Thus, in step 12, control is made for increasing
the number of the units of the intermediate layer L_H. Then, in step 13, setting of K=0 is made. The program
then reverts to step 2 for setting of n=0 and then proceeds to step 3 to execute the learning processing of the
above mentioned l-number of the input signal patterns p.
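
The steps described above for the flow chart of Fig. 9 can be sketched roughly as the following control loop. This is an illustrative reading only: the three callbacks, the name `run_growth_schedule` and the `max_rounds` stopping rule are hypothetical, since the description of Fig. 9 gives no termination condition, and the case in which Lms equals Lms(-1), which the text does not address, is treated here like the "larger" case.

```python
def run_growth_schedule(learn_all_patterns, total_square_error, add_unit,
                        max_rounds=100):
    """Follow steps 1 to 13 as described for Fig. 9.

    learn_all_patterns(): one learning pass over all input patterns (step 3).
    total_square_error(): current value of Lms per formula (28) (step 6).
    add_unit(): increases the number of intermediate-layer units (step 12).
    max_rounds: hypothetical stopping rule, not taken from the text.
    Returns the number of times a unit was added.
    """
    k = 0                       # step 1: local-minimum detection counter K
    lms = 10000000000.0         # step 1: first variable Lms
    added = 0
    for _ in range(max_rounds):
        n = 0                   # step 2
        while True:
            learn_all_patterns()    # step 3
            if n == 3:              # step 4
                break
            n += 1                  # step 5
        lms_prev = lms              # step 6: Lms(-1) keeps the old value
        lms = total_square_error()
        if lms < lms_prev:          # step 7: error decreased
            if k != 0:              # step 8
                k += 1              # step 9, as stated in the text
        else:                       # step 7: error did not decrease
            k += 1                  # step 10
            if k == 2:              # step 11: local minimum state decided
                add_unit()          # step 12
                added += 1
                k = 0               # step 13
    return added

# Dry run with a made-up error sequence instead of a real network.
errors = iter([5.0, 4.0, 4.5, 4.6, 3.0, 2.0])
passes = []
added = run_growth_schedule(
    learn_all_patterns=lambda: passes.append(1),
    total_square_error=lambda: next(errors),
    add_unit=lambda: None,
    max_rounds=6,
)
```

In this dry run the error rises twice in succession (4.5, then 4.6), so K reaches 2 once and one unit is added; each round performs four learning passes (n = 0 to 3).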

A test on the learning processing was conducted on the signal processing section 50 of the above described
second embodiment of the signal processing system constituting the recurrent network including the feedback
loop FB and the loop LP shown in Fig. 5, with the number of the units of the intermediate layer L_H being set to
six (y=6). The test results revealed that the learning processing needed to be repeated an extremely large
number of times, with considerable time expenditure, until the convergence to the optimum minimum state was
achieved, and that the local minimum state prevailed in three out of eight learning processing tests without
convergence to the optimum global minimum state.

Fig. 11 shows, by way of an example, the results of the learning processing tests in which the local mini- 
mum state was reached. 

In this figure, the ordinate stands for the total sum LMS of the square errors and the abscissa stands for the
number of times of the learning processing operations.

Tests on the learning processing were also conducted 30 times on the signal processing section 50 of
the above described second embodiment of the signal processing system constituting the recurrent network
including the feedback loop FB and the loop LP shown in Fig. 5, with the number of the units of the intermediate
layer L_H being set to three (y=3). It was found that, as shown for example in Fig. 12, the local minimum state
was reached in all of the tests on learning processing without convergence to the optimum global minimum
state.

In Fig. 12, the ordinate stands for the total sum LMS of the square errors and the abscissa stands for the
number of times of the learning processing operations.

From the foregoing, it is seen that the present invention can be incorporated in a learning processing
system in which the learning processing of the coefficient of coupling strength is performed while the number of
the units of the intermediate layer is increased by the learning processing section, whereby the convergence
to the optimum global minimum state is achieved promptly and reliably, so that stable learning processing
avoiding the local minimum state is achieved in the learning processing conforming to the back-propagation
learning rule.


Claims 

1. A signal processing system comprising :
a signal processing section (10; 30) composed of a multi-layer neural network with three layers :
an input layer L_I, an intermediate layer L_H, and an output layer L_O, the layers being made up of units u_I1
to u_Ix, u_H1 to u_Hy, u_O1 to u_Oz, respectively, each unit corresponding to a neuron, the network consisting of
feed-forward couplings between the units, each of the units j in the intermediate layer and in the output
layer being designed to issue for an input pattern p entered into the input layer an output signal O_pj
represented by a sigmoid function according to the formula :

O_{pj} = 1/(1 + \exp\{-(net_j + \theta_j)\})

for the total sum net_j of inputs, where
θ_j is a threshold value,
net_j is the total sum of the inputs to a unit j in the intermediate layer and in the output layer, and
O_pj is the jth element of the actual output pattern produced by the representation of the input pattern,
the system further comprising

a learning process section (20; 40) executing a learning process using a back-propagating learning
algorithm, the process consisting in modifying the coupling coefficients W_ji of all units j in the intermediate
layer and in the output layer with a variant ΔW_ji so as to minimize the total sum of square errors
between the actual output value O_oj of unit j in layer L_O produced from an input signal pattern and the
desirable output value t_pj (teacher signal) for said unit j in the layer L_O, whereby W_ji is the weight for the
signal from the ith to the jth unit,

the learning process section (20; 40) being fed with a desired output value t_pj as a teacher signal
for the output value O_oj of a unit j in the output layer L_O for the input signal patterns p entered into the input
layer,

the learning process section computing the error value for each unit in the output layer and in the
intermediate layer,

the error δ_oj of each unit j of the output layer L_O being computed by the formula :

\delta_{oj} = (t_{pj} - O_{oj}) O_{oj} (1 - O_{oj})

the error δ_Hj of each unit j of the intermediate layer L_H being computed by the formula :

\delta_{Hj} = O_{Hj} (1 - O_{Hj}) \sum_k (\delta_{ok} \cdot W_{kj})

the coupling coefficient W_ji of the units in the intermediate layer and in the output layer being given by
the formula :

W_{ji}(n+1) = W_{ji}(n) + \Delta W_{ji}(n)

the learning process being executed repeatedly until the total sum E of the square error between the
desired output afforded as the teacher signal and the output signal becomes sufficiently small,
characterized in that :

the learning process section (20; 40) computes a learning variable β_j for each coupling coefficient
W_ji of each unit j in the intermediate layer and in the output layer for all of its inputs O_i :

\beta_j = 1/(\sum_i (O_{pi}^2) + 1)

and in that for all these units in the intermediate layer and in the output layer the variant ΔW_ji
of the coupling coefficient W_ji is computed by using said learning variable β_j :

\Delta W_{ji}(n) = \eta \cdot \beta_j (\delta_{pj} \cdot O_{pi}) + \alpha \cdot \Delta W_{ji}(n-1)

where η is the learning constant, α is the stabilization constant for reducing the error oscillations, and n
is the number of times of learning.

2. The signal processing system according to claim 1, wherein each of the units in the intermediate layer
and in the output layer is also provided with further couplings each provided with delay means so forming
a recurrent network including :

a loop LP to provide via said delay means its output O_j as one of its inputs, and
a feedback path FB to provide its output O_j as an input to another unit in the same layer or in the
intermediate layer.

3. The signal processing system according to claim 1 or 2, wherein control means are provided in said
learning processing section (20; 40) for increasing the number of the units of said intermediate layer, and
wherein said learning processing section performs learning processing of the coefficient of coupling strength
W_ji as said learning processing section causes the number of the units of the intermediate layer to be
increased.
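
For illustration only, the formulas recited in claim 1 (sigmoid output, error values, learning variable and variant of the coupling coefficient) can be sketched in a few lines. All function and parameter names below are hypothetical choices for this sketch; `eta` stands for the learning constant and `alpha` for the stabilization constant, and the numeric values in the example are arbitrary.

```python
import math

def sigmoid_output(net_j, theta_j):
    """Output of unit j: 1 / (1 + exp(-(net_j + theta_j)))."""
    return 1.0 / (1.0 + math.exp(-(net_j + theta_j)))

def output_delta(t, o):
    """Error of an output-layer unit: (t - O) * O * (1 - O)."""
    return (t - o) * o * (1.0 - o)

def hidden_delta(o_h, deltas_out, weights_from_unit):
    """Error of an intermediate-layer unit:
    O * (1 - O) * sum_k(delta_ok * W_kj)."""
    back = sum(d * w for d, w in zip(deltas_out, weights_from_unit))
    return o_h * (1.0 - o_h) * back

def learning_variable(inputs):
    """Learning variable of unit j: 1 / (sum_i O_pi^2 + 1)."""
    return 1.0 / (sum(o * o for o in inputs) + 1.0)

def weight_changes(delta_j, inputs, prev_changes, eta, alpha):
    """Variant of each coefficient into unit j, including the
    stabilization term alpha * (previous variant)."""
    beta_j = learning_variable(inputs)
    return [eta * beta_j * delta_j * o_i + alpha * prev
            for o_i, prev in zip(inputs, prev_changes)]

# Arbitrary example: a unit with two inputs and a teacher value of 1.0.
inputs = [1.0, 2.0]
delta = output_delta(1.0, 0.5)
changes = weight_changes(delta, inputs, [0.0, 0.2], eta=0.6, alpha=0.5)
```

Because the learning variable shrinks as the sum of squared inputs grows, the effective step size is normalized by the input magnitude of each unit, which is the point of the characterizing part of the claim.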



EP 0 360 674 B1

FIG. 3 (flow chart of the learning processing): compute the output value O_oj for each unit; compute the error value δ_pj for each unit, with δ_oj = (t_pj - O_oj) O_oj (1 - O_oj) for the output layer and δ_Hj = (Σ_k δ_ok · W_kj) O_Hj (1 - O_Hj) for the intermediate layer; branch to TERMINATE LEARNING; compute the learning variable β_j = 1/(Σ_i O_pi² + 1) of the coupling coefficient for each unit; step 5: ΔW_ji = η·β_j (δ_pj · O_pi) + α·ΔW_ji; step 6: W_ji = W_ji + ΔW_ji.

FIG. 6 (flow chart of the learning processing): step 1: apply a pattern to each input unit, O_Ik = P(k,t); step 2: compute the output value O_pj(t) for each unit; step 3: compute the error value δ_pj for each unit; step 4: compute the learning variable β_j of the coupling coefficient W_ji for each unit; step 5: ΔW_ji(n) = η·β_j (δ_pj · O_pi), ΔW_ji(n) = ΔW_ji(n) + α·ΔW_ji(n-1), W_ji = W_ji + ΔW_ji(n), ΔW_ji(n+1) = ΔW_ji(n).